R for Humanitarian Reporting

Cédric Vidonne

Jan 31, 2025

Agenda

  1. Why R for Humanitarian Reporting?
  2. Data Ingestion & Wrangling
  3. Data Visualization
  4. Reporting
  5. Q&A

Why R for Humanitarian Reporting?

What is R?

  • Free & Open Source – No licensing fees, continuously evolving.
  • Expandable with Packages – 20,000+ packages for everything from stats to mapping.
  • Active Community – Millions of users, tutorials, Stack Overflow, RStudio Community.
  • Tidyverse: A Game Changer – A unified approach to data manipulation & visualization.
  • ggplot2: Beautiful, Flexible Charts – Uses the Grammar of Graphics for layered plots.
  • RMarkdown & Quarto – Automate reports, mix text, code, visuals, and tables.

Product creation workflow

Workflow with R

Data Ingestion & Wrangling

Getting data into R

R can import data from various sources:

  • Local files: CSV, Excel, …
  • Databases: SQL, GeoDB, …
  • Online: APIs, web scraping, …
  • R packages: refugees, rnaturalearth, …
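
For the database case, a minimal sketch using the DBI package with an in-memory SQLite database (via RSQLite) could look like the following. The table name `offices` is hypothetical, and the two rows are borrowed from the presence data used later in this deck; a real setup would connect to an existing database server instead.

```r
library(DBI)

# In-memory SQLite database standing in for a real database (assumption)
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table with two rows, for illustration only
offices <- data.frame(
  gis_name = c("Kosti", "Port Sudan"),
  loc_type = c("Sub-Office", "Country Office")
)
DBI::dbWriteTable(con, "offices", offices)

# Send a SQL query and get the result back as a data frame
sub_offices <- DBI::dbGetQuery(
  con,
  "SELECT gis_name FROM offices WHERE loc_type = 'Sub-Office'"
)

# Always close the connection when done
DBI::dbDisconnect(con)

sub_offices
```

The same dbConnect / dbGetQuery pattern should carry over to other DBI backends, with only the driver and connection arguments changing.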

Local file

Example using a local CSV:

library(tidyverse)

sdn_pres <- 
  readr::read_csv("data/sdn_presence.csv")

head(sdn_pres)
# A tibble: 6 × 5
  pcode      gis_name        loc_type         lon   lat
  <chr>      <chr>           <chr>          <dbl> <dbl>
1 SDNp003704 Khashm El Girba Field Office    35.9  15.0
2 SDNp000540 Damazin         Field Office    34.3  11.8
3 SDNp010027 Kosti           Sub-Office      32.7  13.1
4 SDNp000545 El Fasher       Sub-Office      25.3  13.6
5 SDNp000568 Zalingei        Field Office    23.5  12.9
6 SDNp000328 Port Sudan      Country Office  37.2  19.6
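
When the column types matter (for example, coordinates that must stay numeric), they can be declared at import time instead of being guessed. The sketch below is self-contained: it writes three of the rows printed above to a temporary file, so the path and the object names `sdn_pres_typed` and `office_counts` are illustrative and not part of the original workflow.

```r
library(readr)
library(dplyr)

# Write a few of the rows shown above to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "pcode,gis_name,loc_type,lon,lat",
    "SDNp003704,Khashm El Girba,Field Office,35.9,15.0",
    "SDNp000540,Damazin,Field Office,34.3,11.8",
    "SDNp010027,Kosti,Sub-Office,32.7,13.1"
  ),
  csv_path
)

# Declare the expected type of each column explicitly
sdn_pres_typed <- readr::read_csv(
  csv_path,
  col_types = readr::cols(
    pcode    = readr::col_character(),
    gis_name = readr::col_character(),
    loc_type = readr::col_character(),
    lon      = readr::col_double(),
    lat      = readr::col_double()
  )
)

# Quick sanity check: number of locations per office type
office_counts <- dplyr::count(sdn_pres_typed, loc_type)
office_counts
```

Declaring `col_types` also silences the column specification message and makes the import fail loudly if the file layout changes.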

From API

Example pulling a file directly from HDX with the rhdx package:

library(rhdx)

sdn_displ <- 
  rhdx::pull_dataset("unhcr-population-data-for-sdn") |>
  rhdx::get_resource(1) |> 
  rhdx::read_resource()

head(sdn_displ)
# A tibble: 6 × 24
   Year `Country of Origin Code` `Country of Asylum Code` Country of Origin Na…¹
  <dbl> <chr>                    <chr>                    <chr>                 
1  2001 SDN                      AGO                      Sudan                 
2  2001 SDN                      EGY                      Sudan                 
3  2001 SDN                      EGY                      Sudan                 
4  2001 SDN                      AUS                      Sudan                 
5  2001 SDN                      AUS                      Sudan                 
6  2001 SDN                      AUT                      Sudan                 
# ℹ abbreviated name: ¹​`Country of Origin Name`
# ℹ 20 more variables: `Country of Asylum Name` <chr>, `Population Type` <chr>,
#   location <chr>, urbanRural <chr>, accommodationType <chr>,
#   `Female 0-4` <dbl>, `Female 5-11` <dbl>, `Female 12-17` <dbl>,
#   `Female 18-59` <dbl>, `Female 60 or more` <dbl>, `Female Unknown` <dbl>,
#   `Female Total` <dbl>, `Male 0-4` <dbl>, `Male 5-11` <dbl>,
#   `Male 12-17` <dbl>, `Male 18-59` <dbl>, `Male 60 or more` <dbl>, …
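
Column names with spaces, such as `Country of Origin Code`, need backticks every time they are used. One possible first wrangling step is to convert them to snake_case. The sketch below works on a small stand-in tibble built from the first columns and rows shown above; `to_snake()` is a hypothetical helper written for this example, not a function from rhdx or the tidyverse.

```r
library(dplyr)
library(tibble)

# Stand-in for sdn_displ, using only columns and values visible in the output above
sdn_displ_demo <- tibble::tibble(
  Year = c(2001, 2001),
  `Country of Origin Code` = c("SDN", "SDN"),
  `Country of Asylum Code` = c("AGO", "EGY")
)

# Hypothetical helper: lower-case, replace runs of other characters with "_",
# then trim any leading or trailing underscore
to_snake <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)
  gsub("^_|_$", "", x)
}

# Apply the helper to every column name
sdn_displ_clean <- dplyr::rename_with(sdn_displ_demo, to_snake)

names(sdn_displ_clean)
```

Packages such as janitor offer a ready-made `clean_names()` for the same purpose; the hand-written helper is used here only to keep the example free of extra dependencies.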

R Packages

Example using data in the refugees package:

library(refugees)

wrl_displ <- 
  refugees::population

head(wrl_displ)
# A tibble: 6 × 16
   year coo_name coo   coo_iso coa_name  coa   coa_iso refugees asylum_seekers
  <dbl> <chr>    <chr> <chr>   <chr>     <chr> <chr>      <dbl>          <dbl>
1  1951 Unknown  UKN   UNK     Australia AUL   AUS       180000              0
2  1951 Unknown  UKN   UNK     Austria   AUS   AUT       282000              0
3  1951 Unknown  UKN   UNK     Belgium   BEL   BEL        55000              0
4  1951 Unknown  UKN   UNK     Canada    CAN   CAN       168511              0
5  1951 Unknown  UKN   UNK     Denmark   DEN   DNK         2000              0
6  1951 Unknown  UKN   UNK     France    FRA   FRA       290000              0
# ℹ 7 more variables: returned_refugees <dbl>, idps <dbl>, returned_idps <dbl>,
#   stateless <dbl>, ooc <dbl>, oip <dbl>, hst <dbl>
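
Once the data is loaded, typical wrangling steps are aggregating and ranking. The sketch below rebuilds the six rows printed above as a small tibble (`wrl_demo`, a name introduced here) so it runs without the refugees package; with the package installed, the same pipeline could presumably be applied to `wrl_displ` directly.

```r
library(dplyr)
library(tibble)

# The six rows shown above, reduced to the columns needed here
wrl_demo <- tibble::tibble(
  year     = rep(1951, 6),
  coa_name = c("Australia", "Austria", "Belgium", "Canada", "Denmark", "France"),
  coa_iso  = c("AUS", "AUT", "BEL", "CAN", "DNK", "FRA"),
  refugees = c(180000, 282000, 55000, 168511, 2000, 290000)
)

# Total refugees per year
refugees_by_year <- wrl_demo |>
  dplyr::group_by(year) |>
  dplyr::summarise(refugees = sum(refugees), .groups = "drop")

# Three countries of asylum hosting the most refugees
top_hosts <- wrl_demo |>
  dplyr::slice_max(refugees, n = 3) |>
  dplyr::select(coa_name, refugees)

refugees_by_year
top_hosts
```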

Data Visualization

Reporting

Questions